
64 research outputs found

The top five essential modules that may contain multiple potential drug targets.

Author: Daniel J. Hassett (148382)
Long J. Lu (148388)
Minlu Zhang (148373)
Raj K. Bhatnagar (148378)
Shengchang Su (148374)

In the figure, a diamond node represents a hub protein, and a hexagon node represents a hub and bottleneck protein in the high-confidence network. Larger nodes indicate essential proteins, and smaller ones are non-essential. The majority of module members are hubs and/or bottlenecks in the network, reflecting their essentiality. <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041202#pone-0041202-g002" target="_blank">Figures 2</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041202#pone-0041202-g003" target="_blank">3</a>, and <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041202#pone-0041202-g004" target="_blank">4</a> were drawn using Cytoscape <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041202#pone.0041202-Shannon2" target="_blank">[72]</a>.

FigShare

A Statistical Framework for Improving Genomic Annotations of Prokaryotic Essential Genes

Author: Daniel J. Hassett (148382)
Jingyuan Deng (250051)
Long Jason Lu (200790)
Shengchang Su (148374)
Xiaodong Lin (387188)
Publication date: 08/03/2013

Large-scale systematic analysis of gene essentiality is an important step toward unraveling the complex relationship between genotypes and phenotypes. Such analysis cannot be accomplished without unbiased and accurate annotations of essential genes. In current genomic databases, most essential gene annotations are derived from whole-genome transposon mutagenesis (TM), the most frequently used experimental approach for determining essential genes in microorganisms under defined conditions. However, there are substantial systematic biases associated with TM experiments. In this study, we developed a novel Poisson model–based statistical framework to simulate the TM insertion process and subsequently correct the experimental biases. We first quantitatively assessed the effects of major factors that potentially influence the accuracy of TM and subsequently incorporated relevant factors into the framework. By iteratively optimizing parameters, we inferred the insertion events that actually occurred and described each gene’s essentiality as a probability. Evaluated against the definitive mapping of the essential gene profile in Escherichia coli, our model significantly improved the accuracy of the original TM datasets, resulting in more accurate annotations of essential genes. Our method also showed encouraging results in improving subsaturation-level TM datasets. To test our model’s broad applicability to other bacteria, we applied it to Pseudomonas aeruginosa PAO1 and Francisella tularensis novicida TM datasets. We validated our predictions against the literature as well as by allelic exchange experiments in PAO1. Our model was correct on six of the seven tested genes. Remarkably, in all three cases where our predictions contradicted the TM assignments, experimental validation supported our predictions. In summary, our method will be a promising tool for improving genomic annotations of essential genes and enabling large-scale explorations of gene essentiality.
Our contribution is timely considering the rapidly increasing number of essential gene sets. A web server has been set up to provide convenient access to this tool. All results and source code are available for download upon publication at <a href="http://research.cchmc.org/essentialgene/" target="_blank">http://research.cchmc.org/essentialgene/</a>.

Directory of Open Access Journals

PubMed Central

FigShare

Prediction and Analysis of the Protein Interactome in Pseudomonas aeruginosa to Enable Network-Based Drug Target Selection

Author: Daniel J. Hassett (148382)
Long J. Lu (148388)
Minlu Zhang (148373)
Raj K. Bhatnagar (148378)
Shengchang Su (148374)
Publication date: 24/07/2012

Pseudomonas aeruginosa (PA) is a ubiquitous opportunistic pathogen that is capable of causing highly problematic, chronic infections in cystic fibrosis and chronic obstructive pulmonary disease patients. With the increased prevalence of multidrug-resistant PA, the conventional “one gene, one drug, one disease” paradigm is losing effectiveness. Network pharmacology, on the other hand, may hold the promise of discovering new drug targets to treat a variety of PA infections. However, despite the urgent need for novel drug target discovery, a PA protein-protein interaction (PPI) network of high accuracy and coverage has not yet been constructed. In this study, we predicted a genome-scale PPI network of PA by integrating various genomic features of PA proteins/genes using a machine learning-based approach. A total of 54,107 interactions covering 4,181 proteins in PA were predicted. A high-confidence network, which combines predicted high-confidence interactions, a reference set, and verified interactions and consists of 3,343 proteins and 19,416 potential interactions, was further assembled and analyzed. The predicted interactome network from this study is the first large-scale PPI network in PA with significant coverage and high accuracy. Subsequent analysis, including validations based on existing small-scale PPI data and a comparison of network structure with other model organisms, shows the validity of the predicted PPI network. Potential drug targets were identified and prioritized based on their essentiality and topological importance in the high-confidence network. Host-pathogen protein interactions between human and PA were further extracted and analyzed. In addition, case studies were performed on protein interactions involving the anti-sigma factor MucA, the negative periplasmic alginate regulator MucB, and the transcriptional regulator RhlR.
A web server to access the predicted PPI dataset is available at <a href="http://research.cchmc.org/PPIdatabase/">http://research.cchmc.org/PPIdatabase/</a>.

Directory of Open Access Journals

PubMed Central

FigShare

Illustration of the statistical model.

Author: Daniel J. Hassett (148382)
Jingyuan Deng (250051)
Long Jason Lu (200790)
Shengchang Su (148374)
Xiaodong Lin (387188)

In a TM experiment, if a gene has no observed insertions, meaning it is TM essential (TmE), what could it really be? There are two possibilities. (1) Part A: It never received an insertion and was missed by all transposons by chance. We then have no useful information about this gene and are completely blind to it. For any blind gene, our best guess is that its probability of being essential equals the overall essential gene rate, Pr(overall essential), and its probability of being non-essential equals 1 − Pr(overall essential). (2) Part B: It actually received insertions, but all insertion mutants died. This means that the gene is truly essential. In this way, we can split the TM-assigned essential genes into two parts, TETmE and FETmE. Similarly, if a gene is observed to have insertions in the TM experiment, meaning it is TM non-essential, what could it really be? There are again two possibilities. (1) Part C: All the observed insertions are ineffective and did not interrupt the gene function. We are again blind about this gene, so it has a certain chance of being essential and a certain chance of being non-essential. (2) Part D: There was at least one effective insertion that interrupted the gene function. This means the gene is truly non-essential.
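The split of a no-insertion gene into Part A and Part B can be sketched numerically. The snippet below is a minimal illustration only and is not the authors' implementation: it assumes the number of insertion events in a gene is Poisson-distributed with a known mean (`expected_insertions`, a hypothetical parameter), that any insertion into an essential gene is lethal, and that `prior_essential` stands in for Pr(overall essential).

```python
import math


def no_insertion_posterior(prior_essential, expected_insertions):
    """Weights of Part A / Part B and Pr(essential) for a gene with no observed insertions.

    Assumptions (illustrative, not taken from the source):
      * insertion events per gene follow Poisson(expected_insertions);
      * every insertion into an essential gene kills the mutant.
    """
    if not 0.0 <= prior_essential <= 1.0:
        raise ValueError("prior_essential must lie in [0, 1]")
    if expected_insertions < 0.0:
        raise ValueError("expected_insertions must be non-negative")

    p_missed = math.exp(-expected_insertions)  # Poisson Pr(no insertion event)
    p_hit = 1.0 - p_missed

    # Two ways to end up with zero observed insertions:
    weight_a = p_missed                 # Part A: never hit, so we are blind
    weight_b = p_hit * prior_essential  # Part B: hit, but every mutant died
    total = weight_a + weight_b
    if total == 0.0:
        raise ValueError("zero observed insertions is impossible under these inputs")

    part_a = weight_a / total
    part_b = weight_b / total
    # Blind genes fall back on the overall rate; Part B genes are truly essential.
    essential = part_a * prior_essential + part_b
    return {"part_a": part_a, "part_b": part_b, "essential": essential}


if __name__ == "__main__":
    for mean in (0.0, 1.0, 5.0):
        print(mean, no_insertion_posterior(0.1, mean))
```

With an expected count of zero the gene is entirely blind and the result falls back to the prior; as the expected count grows, a gene with no observed insertions becomes increasingly likely to be truly essential.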

FigShare

Three factors have strong associations with false TM assignments.

Author: Daniel J. Hassett (148382)
Jingyuan Deng (250051)
Long Jason Lu (200790)
Shengchang Su (148374)
Xiaodong Lin (387188)

(A) Gene length. TmE genes are significantly shorter than genes in the PEC dataset and than genes overall. Many of these short genes may be false essential genes. (B) Position of insertions. Essential genes mistakenly assigned as non-essential by TM often have insertions in the 25% extreme ends (5% at the 5′ end and 20% at the 3′ end). These insertions do not completely disrupt a gene’s function. (C) Number of insertions. 75% of the essential genes mistakenly assigned as non-essential by TM have only one insertion.
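Factor (B) suggests a simple positional filter. The following is a rough sketch, not the authors' code: it treats an insertion as potentially disruptive only if it falls outside the first 5% (5′ end) and last 20% (3′ end) of the gene. The function name, the coordinate convention (1-based, inclusive), and the strand encoding are all assumptions made for illustration.

```python
def effective_insertions(gene_start, gene_end, strand, insertion_positions,
                         five_prime_frac=0.05, three_prime_frac=0.20):
    """Return the insertions that fall outside the gene's extreme ends.

    Coordinates are assumed to be 1-based and inclusive; strand is "+" or "-".
    The default fractions follow the 5% (5' end) and 20% (3' end) figures
    quoted in the caption above.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if gene_end < gene_start:
        raise ValueError("gene_end must not be smaller than gene_start")

    length = gene_end - gene_start + 1
    kept = []
    for pos in insertion_positions:
        if not gene_start <= pos <= gene_end:
            continue  # insertion lies outside this gene
        # Distance from the 5' end, which depends on the coding strand.
        offset = pos - gene_start if strand == "+" else gene_end - pos
        relative = offset / length
        if five_prime_frac <= relative < 1.0 - three_prime_frac:
            kept.append(pos)
    return kept


if __name__ == "__main__":
    # Hypothetical gene spanning positions 1..100 on the forward strand.
    print(effective_insertions(1, 100, "+", [3, 50, 90, 150]))
```

A gene whose observed insertions are all filtered out this way would correspond to the "blind" Part C case of the statistical model rather than to a confirmed non-essential gene.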

FigShare

Improvement of overlaps with the PEC dataset using our model.

Author: Daniel J. Hassett (148382)
Jingyuan Deng (250051)
Long Jason Lu (200790)
Shengchang Su (148374)
Xiaodong Lin (387188)


FigShare

A level-1 interaction map for MucA and MucB.

Author: Daniel J. Hassett (148382)
Long J. Lu (148388)
Minlu Zhang (148373)
Raj K. Bhatnagar (148378)
Shengchang Su (148374)

Each node is a protein and each edge is a predicted PPI from the high-confidence network (except the interaction MucA-AlgW, which comes from experimental PPI data). A total of 39 proteins and 199 interactions were captured by the level-1 PPI network for MucA and MucB. The 17 red nodes are essential proteins. Yellow edges indicate high-confidence interactions included in the high-confidence network.
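A level-1 map of this kind can be extracted from an edge list in a few lines. The sketch below assumes that "level-1" means the seed proteins plus their direct interaction partners, with all edges among those proteins retained; that reading, the function name, and the toy edge list (including the placeholder proteins P1 to P3) are illustrative assumptions and do not come from the predicted dataset.

```python
def level1_subnetwork(edges, seeds):
    """Return (nodes, edges) of the level-1 neighbourhood of the seed proteins.

    edges: iterable of (protein_a, protein_b) pairs, treated as undirected.
    seeds: proteins of interest, e.g. {"MucA", "MucB"}.
    Assumption: edges between two neighbours are kept (induced subgraph).
    """
    edges = list(edges)
    seeds = set(seeds)

    # Seeds plus every protein that shares an edge with a seed.
    nodes = set(seeds)
    for a, b in edges:
        if a in seeds:
            nodes.add(b)
        if b in seeds:
            nodes.add(a)

    # Keep each undirected edge once, ignoring self-loops.
    sub_edges = {frozenset((a, b)) for a, b in edges
                 if a in nodes and b in nodes and a != b}
    return nodes, sub_edges


if __name__ == "__main__":
    # Invented toy edges for illustration only.
    toy_edges = [("MucA", "AlgW"), ("MucA", "MucB"), ("MucB", "P1"),
                 ("AlgW", "P1"), ("P1", "P2"), ("P2", "P3")]
    nodes, sub_edges = level1_subnetwork(toy_edges, {"MucA", "MucB"})
    print(sorted(nodes), len(sub_edges))
```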

FigShare

Performance of the random forest classifier for the positive class in 10-fold cross-validation.

Author: Daniel J. Hassett (148382)
Long J. Lu (148388)
Minlu Zhang (148373)
Raj K. Bhatnagar (148378)
Shengchang Su (148374)


FigShare

Enrichment of true essential genes using different thresholds of the confidence score.

Author: Daniel J. Hassett (148382)
Jingyuan Deng (250051)
Long Jason Lu (200790)
Shengchang Su (148374)
Xiaodong Lin (387188)


FigShare

Validation using allelic exchange experiments in Pseudomonas aeruginosa PAO1. E – Essential; N – Non-essential.

Author: Daniel J. Hassett (148382)
Jingyuan Deng (250051)
Long Jason Lu (200790)
Shengchang Su (148374)
Xiaodong Lin (387188)


FigShare
